Text Classification and Labelling of Document Clusters with Self-Organising Maps

Authors

  • Andreas Rauber
  • Erich Schweighofer
  • Dieter Merkl
Abstract

Law that is freely available on the Internet could be one of the best application areas for text classification and labelling. This paper explores the high potential of the self-organising map for information reconnaissance by classifying and describing unknown legal text collections. The maps can be seen as topic-oriented libraries that are created automatically, without intellectual input. The units of the self-organising map, which represent clustered topics, are labelled with the most appropriate keywords. Extensive tests have shown the potential of this approach.
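
As a rough illustration of the idea in the abstract, the following minimal sketch trains a small self-organising map on toy term-frequency vectors and then labels each unit with keywords. It is not the authors' implementation: the toy vocabulary and documents, the function names (`train_som`, `best_units`, `label_units`), the linear decay schedules, and the labelling rule (take the highest-weighted terms of each unit's weight vector) are all assumptions made for the example.

```python
import numpy as np


def train_som(data, rows, cols, epochs=200, lr0=0.5, sigma0=1.0, seed=0):
    """Train a small SOM; returns weights of shape (rows * cols, n_features)."""
    rng = np.random.default_rng(seed)
    n_units = rows * cols
    # Initialise each unit's weight vector from a randomly chosen document.
    weights = data[rng.integers(0, len(data), n_units)].astype(float)
    # Grid coordinates of every unit, used by the neighbourhood function.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                 # assumed linear decay
            sigma = sigma0 * (1.0 - frac) + 1e-3    # assumed linear decay
            # Best-matching unit: the closest weight vector to the input.
            bmu = np.argmin(np.sum((weights - x) ** 2, axis=1))
            # Gaussian neighbourhood around the BMU on the map grid.
            dist2 = np.sum((grid - grid[bmu]) ** 2, axis=1)
            h = np.exp(-dist2 / (2.0 * sigma ** 2))
            weights += lr * h[:, None] * (x - weights)
            step += 1
    return weights


def best_units(data, weights):
    """Index of the best-matching unit for every document."""
    d = ((data[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)


def label_units(weights, vocab, top_k=2):
    """Assumed labelling rule: the top_k highest-weighted terms per unit."""
    return [[vocab[i] for i in np.argsort(w)[::-1][:top_k]] for w in weights]


# Hypothetical toy corpus: rows are documents, columns are term counts.
vocab = ["tax", "income", "treaty", "border"]
docs = np.array([[3, 2, 0, 0],
                 [2, 3, 0, 0],
                 [0, 0, 3, 2],
                 [0, 0, 2, 3]], dtype=float)

weights = train_som(docs, rows=1, cols=2)
units = best_units(docs, weights)
labels = label_units(weights, vocab)
print("unit per document:", units)
print("labels per unit:", labels)
```

Real document collections would need a proper term-weighting step (for example tf-idf) and a much larger map; the sketch only shows how clustering and keyword labelling fit together.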

Similar resources

Improving the Quality of Labels for Self-Organising Maps Using Fine-Tuning

Vector representation of legal documents is still the best way of computing classification clusters and labelling their contents. A very special problem occurs with self-organising maps: strong clusters tend to dominate neighbouring smaller clusters in terms of their weight vector structure, which influences the labels extracted from these. This unwelcome side-effect can be overcome efficient...

On Document Classification with Self-Organising Maps

This research deals with the use of self-organising maps for the classification of text documents. The aim was to classify documents to separate classes according to their topics. We therefore constructed self-organising maps that were effective for this task and tested them with German newspaper documents. We compared the results gained to those of k nearest neighbour searching and k-means clu...

Self-Organising Maps for Hierarchical Tree View Document Clustering Using Contextual Information

In this paper we propose an effective method to cluster documents into a dynamically built taxonomy of topics, directly extracted from the documents. We take into account short contextual information within the text corpus, which is weighted by importance and used as input to a set of independently spun growing Self-Organising Maps (SOM). This work shows an increase in precision and labelling q...

Self-Organising Maps in Document Classification: A Comparison with Six Machine Learning Methods

This paper focuses on the use of self-organising maps, also known as Kohonen maps, for the task of classifying text documents. The aim is to effectively and automatically classify documents into separate classes based on their topics. Classification with self-organising maps was tested with three data sets and the results were then compared to those of six well-known baseline methods: k-mea...

Learning Document Image Features With SqueezeNet Convolutional Neural Network

The classification of various document images is considered an important step towards building a modern digital library or office automation system. Convolutional Neural Network (CNN) classifiers trained with backpropagation are considered to be the current state of the art model for this task. However, there are two major drawbacks for these classifiers: the huge computational power demand for...

Publication date: 2000